one-layer neural network



The Measure of Deception: An Analysis of Data Forging in Machine Unlearning

Dixit, Rishabh, Hui, Yuan, Saab, Rayan

arXiv.org Machine Learning

Motivated by privacy regulations and the need to mitigate the effects of harmful data, machine unlearning seeks to modify trained models so that they effectively "forget" designated data. A key challenge in verifying unlearning is forging: adversarially crafting data that mimics the gradient of a target point, thereby creating the appearance of unlearning without actually removing information. To capture this phenomenon, we consider the collection of data points whose gradients approximate a target gradient within tolerance $\varepsilon$ -- which we call an $\varepsilon$-forging set -- and develop a framework for its analysis. For linear regression and one-layer neural networks, we show that the Lebesgue measure of this set is small: it scales on the order of $\varepsilon$ and, when $\varepsilon$ is small enough, on the order of $\varepsilon^d$. More generally, under mild regularity assumptions, we prove that the forging set measure decays as $\varepsilon^{(d-r)/2}$, where $d$ is the data dimension and $r$ is the rank of an associated Jacobian.
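For concreteness, the central object can be written down as follows. This is a sketch in assumed notation ($\ell$ for the per-example loss, $\theta$ for the model parameters, $z^*$ for the target point), not the paper's exact definitions:

```latex
% Hedged sketch of the epsilon-forging set in assumed notation:
% \ell(z; \theta) is the per-example loss at parameters \theta, and z^*
% is the target point whose gradient an adversary tries to mimic.
\[
  F_{\varepsilon}(z^{*}; \theta)
    = \bigl\{\, z \in \mathbb{R}^{d} :
        \bigl\| \nabla_{\theta}\, \ell(z; \theta)
                - \nabla_{\theta}\, \ell(z^{*}; \theta) \bigr\|
        \le \varepsilon \,\bigr\}.
\]
% In this notation the abstract's results read:
% Leb(F_varepsilon) = O(\varepsilon) for linear regression and one-layer
% networks, improving to O(\varepsilon^{d}) for small enough \varepsilon,
% and O(\varepsilon^{(d-r)/2}) under the mild regularity assumptions.
```

Exact forging corresponds to $\varepsilon = 0$, so the measure bounds quantify how rare even approximate forgeries are in the data space.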


Author response to reviews (excerpt)

Neural Information Processing Systems

We thank all reviewers for their comments and will incorporate them in the revised version of our article. R#2 ("My only relatively minor issue with the paper ... range of the earlier layers"): As you point out, we incorrectly asserted that it suffices to consider a one-layer neural network. Thank you for this interesting comment; we will clarify/fix these issues in the final version of our paper. R#2 also suggested that in Line 84 we add a reference to "A Geometric Analysis of Phase Retrieval". R#3: We really appreciate your positive feedback!


An effective and efficient green federated learning method for one-layer neural networks

Fontenla-Romero, Oscar, Guijarro-Berdiñas, Bertha, Hernández-Pereira, Elena, Pérez-Sánchez, Beatriz

arXiv.org Artificial Intelligence

Nowadays, machine learning algorithms continue to grow in complexity and require substantial computational resources and energy. For these reasons, there is growing interest in developing new green algorithms, and distributed AI can contribute to this goal. Federated learning (FL) is one of the most active research lines in machine learning, as it allows collaborative models to be trained in a distributed way, an attractive option in many real-world settings such as the Internet of Things, where models can run on edge computing devices. In this work, we present an FL method, based on a neural network without hidden layers, capable of generating a global collaborative model in a single training round, unlike traditional FL methods that require multiple rounds for convergence. This yields an effective and efficient model and simplifies the management of the training process. Moreover, the method preserves data privacy by design, a crucial aspect under current data protection regulations. We conducted experiments with large datasets and a large number of federated clients. Despite being based on a network model without hidden layers, the method maintains competitive accuracy in all cases compared to more complex state-of-the-art machine learning models. Furthermore, we show that it performs equally well in both identically and non-identically distributed scenarios. Finally, it is an environmentally friendly algorithm, as it allows significant energy savings during training compared to its centralized counterpart.
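One plausible way such a single-round scheme can work, sketched below under the assumption of a regularized least-squares formulation (an illustrative reading, not the authors' exact algorithm): each client shares only aggregate statistics of its local data, so raw data never leaves the device and one communication round suffices.

```python
import numpy as np

# Hedged sketch of single-round federated learning for a model with no
# hidden layers. Assumption: a linear model fit by regularized least
# squares. Clients share only aggregate statistics (X^T X and X^T y),
# never raw data, which is what gives privacy by design and lets the
# server solve the global problem in one round.

def client_update(X, y):
    """Compute a client's sufficient statistics from its local data."""
    return X.T @ X, X.T @ y  # (d, d) Gram matrix and (d,) moment vector

def server_aggregate(stats, lam=1e-3):
    """Sum client statistics and solve once: a single training round."""
    d = stats[0][0].shape[0]
    A = sum(G for G, _ in stats) + lam * np.eye(d)  # ridge term for stability
    b = sum(m for _, m in stats)
    return np.linalg.solve(A, b)  # global weights w solving A w = b

# Toy usage: three clients with non-identically distributed local data.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
clients = []
for shift in (-1.0, 0.0, 1.0):  # shift features to make clients non-IID
    X = rng.normal(loc=shift, size=(200, 5))
    y = X @ w_true + 0.1 * rng.normal(size=200)
    clients.append(client_update(X, y))

w_global = server_aggregate(clients)
print(np.allclose(w_global, w_true, atol=0.1))  # recovers the true weights
```

Because the summed statistics equal those of the pooled dataset, the one-round solution in this sketch coincides with centralized training on all the data, which is why a scheme of this shape need not lose accuracy relative to its centralized counterpart.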


Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent

Goel, Surbhi, Gollakota, Aravind, Jin, Zhihan, Karmalkar, Sushrut, Klivans, Adam

arXiv.org Machine Learning

We prove the first superpolynomial lower bounds for learning one-layer neural networks with respect to the Gaussian distribution using gradient descent. We show that any classifier trained using gradient descent with respect to square-loss will fail to achieve small test error in polynomial time given access to samples labeled by a one-layer neural network. For classification, we give a stronger result, namely that any statistical query (SQ) algorithm (including gradient descent) will fail to achieve small test error in polynomial time. Prior work held only for gradient descent run with small batch sizes, required sharp activations, and applied to specific classes of queries. Our lower bounds hold for broad classes of activations including ReLU and sigmoid. The core of our result relies on a novel construction of a simple family of neural networks that are exactly orthogonal with respect to all spherically symmetric distributions.
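As background for why a statistical query lower bound subsumes gradient descent (a standard observation, written here in assumed notation rather than the paper's): each coordinate of a population-gradient step is itself a statistical query, answered up to some tolerance $\tau$.

```latex
% Sketch, standard SQ framework with assumed notation: an SQ algorithm
% submits queries \phi and receives E_{(x,y) ~ D}[\phi(x,y)] up to
% tolerance \tau. Gradient descent on the population square loss fits
% this mold, since each gradient coordinate is such an expectation:
\[
  \theta_{t+1} = \theta_{t}
    - \eta \, \nabla_{\theta}\,
      \mathbb{E}_{(x,y) \sim D}\bigl[(f_{\theta_t}(x) - y)^{2}\bigr],
  \qquad
  \bigl(\nabla_{\theta}\bigr)_{i}
    = \mathbb{E}_{(x,y) \sim D}\bigl[\partial_{\theta_i}
        (f_{\theta_t}(x) - y)^{2}\bigr].
\]
% A lower bound against every SQ algorithm therefore rules out gradient
% descent as well, which is how the paper's classification result
% strengthens the gradient-descent-specific one.
```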